We discuss how the apparently objective probabilities predicted by quantum mechanics can be 
treated in the framework of Bayesian probability theory, in which all probabilities are subjective. 
Our results are in accord with earlier work by Caves, Fuchs, and Schack, but our approach and 
emphasis are different. We also discuss the problem of choosing a noninformative prior for a density 
matrix. 
I. INTRODUCTION 

Probability plays a central role throughout human af- 
fairs, and so everyone has an intuitive idea of what it 
is. Moreover, because of the extreme generality and 
widespread use of the concept of probability, it cannot 
be easily defined in terms of anything more basic. For 
example, the dictionary that I have in my office, Webster's Ninth New
Collegiate, says that probability is "the state or quality of being
probable"; that to be probable is to be "supported by evidence strong
enough to establish presumption but not proof"; and that presumption
is "the ground, reason, or evidence lending probability to a belief".
This is clearly unhelpful to anyone who does 
not already know what probability is. 

In mathematics and physics, we are often faced with a 
concept that is both simple enough to be clearly under- 
stood, and fundamental enough to resist definition; for 
example, a straight line in Euclidean geometry. To make 
progress, we do not attempt to devise ever clearer defini- 
tions, but instead formulate axioms that our understood 
but undefined objects are postulated to obey. Then, us- 
ing codified rules of logical inference, we prove theorems 
that follow from the axioms. 

It is instructive to treat probability as one of these 
primitive concepts. Dispensing, then, with any attempt 
at definition, we say that the probability that a state- 
ment is true is a real number between zero and one. A 
statement may be true or false; if we know it to be true, 
we assign it a probability of one, and if we know it to 
be false, we assign it a probability of zero. If we do not 
know whether it is true or false, we assign it a probability 
between zero and one. 

There is typically no definitive way to make this as- 
signment. Different people could (and often do) assign 
different numerical values to the probability that some 
particular statement ("the stock price of Microsoft will 
be higher one year from now") is true. In this sense, 
probability is subjective. This point of view is Bayesian. 

Probability also enters quantum mechanics, in a seemingly more
fundamental way. For example, given a wave function $\psi(x,t)$ for a
particle in one dimension, the rules of quantum mechanics (which are
apparently laws of nature) tell us that we must assign a probability
$|\psi(x,t)|^2\,dx$ to the statement "at time $t$, the particle is
between $x$ and $x + dx$". Different people do not appear
to have a choice about this assignment. In this sense,
quantum probability appears to be objective. 

The goal of this paper is to understand how the apparently objective
probabilities of quantum mechanics can be fit into the Bayesian
framework, which allows different people to make different probability
assignments. This issue has been addressed before by Caves, Fuchs,
and Schack [1], and our results are in broad agreement with theirs.
However, we emphasize a somewhat different approach to certain issues,
which we will explain as we go along. 

In section II, in order to fix the notation and key concepts, we
briefly review the axioms and basic theorems of probability theory.
In section III we introduce the notion of a probability of a
probability, and explain how it can be applied to experimental data to
turn an originally subjective probability into an increasingly
objective one, in the sense that all but strongly biased observers
agree with the final probability assignment. In section IV we apply
this formalism to the probabilities of quantum mechanics. In section V
we discuss when and why it is preferable to assign probabilities to
possible density matrices for a quantum system, rather than assigning
a particular density matrix. In section VI we discuss the construction
of noninformative prior distributions for density matrices. We
summarize and conclude in section VII.

II. THE AXIOMS OF PROBABILITY 

The statements to which we may assign probabilities must obey a
logical calculus. Some key definitions (in which "iff" is short for
"if and only if"):

$S$ = a statement.

$\Omega$ = a statement known to be true.

$\emptyset$ = a statement known to be false.

$\bar S$ = a statement that is true iff $S$ is false.

$S_1 \vee S_2$ = a statement that is true iff either $S_1$ or $S_2$ is true.

$S_1 \wedge S_2$ = a statement that is true iff both $S_1$ and $S_2$ are true.

$S_1$ and $S_2$ are mutually exclusive iff $S_1 \wedge S_2 = \emptyset$.

$S_1, \ldots, S_n$ are a complete set iff $S_1 \vee \ldots \vee S_n = \Omega$
and $S_i \wedge S_j = \emptyset$ for $i \neq j$. 

Elementary logical relationships among statements include
$S \vee \bar S = \Omega$, $S \wedge \bar S = \emptyset$,
$S \wedge \Omega = S$,
$S_1 \wedge (S_2 \vee S_3) = (S_1 \wedge S_2) \vee (S_1 \wedge S_3)$,
etc. Denoting the probability assigned to a statement $S$ as $P(S)$,
we can state the first three axioms of probability.

Axiom 1. $P(S)$ is a nonnegative real number.

Axiom 2. $P(S) = 1$ iff $S$ is known to be true.

Axiom 3. If $S_1$ and $S_2$ are mutually exclusive, then
$P(S_1 \vee S_2) = P(S_1) + P(S_2)$. 

From these axioms, and the logical calculus of statements, we can
derive some simple lemmas:

Lemma 1. $P(\bar S) = 1 - P(S)$.

Lemma 2. $P(S) \le 1$.

Lemma 3. $P(S) = 0$ iff $S$ is known to be false.

Lemma 4. $P(S_1 \wedge S_2) = P(S_1) + P(S_2) - P(S_1 \vee S_2)$.

We omit the proofs, which are straightforward. 

We will also need the notion of a conditional statement $S_2|S_1$.
The expression $S_2|S_1$ is a statement if and only if $S_1$ is true;
otherwise, $S_2|S_1$ is not a statement, and cannot be assigned a
probability. Given that $S_1$ is true, the statement $S_2|S_1$ is true
if and only if $S_2$ is true. The probability that $S_2|S_1$ is true
is then specified by

Axiom 4. $P(S_2|S_1) = P(S_1 \wedge S_2)/P(S_1)$.

Note that, if $P(S_1) = 0$, then $S_1 = \emptyset$ by Lemma 3,
and so both sides of Axiom 4 are undefined: the right
side because we have divided by zero, and the left side
because $S_2|\emptyset$ is not a statement. 

Another concept we will need is that of independence between
statements. Two statements are said to be independent if the knowledge
that one of them is true tells us nothing about whether or not the
other one is true. Thus, if $S_1$ and $S_2$ are independent, we should
have $P(S_1|S_2) = P(S_1)$ and $P(S_2|S_1) = P(S_2)$. Using these
relations and Axiom 4, we get a result that can be used
as the definition of independence:

$S_1$ and $S_2$ are independent if and only if
$P(S_1 \wedge S_2) = P(S_1)P(S_2)$.

Note that independence is a property of probability assignments,
rather than of the statements themselves. Thus, people can disagree on
whether or not two statements are independent. 



III. PROBABILITIES OF PROBABILITIES 

What limitations, if any, should be placed on the na- 
ture of statements to which we are allowed to assign prob- 
abilities? 

There are various schools of thought. Frequentists assign
probabilities only to random variables, a highly restricted class of
statements that we shall not attempt to elucidate. Bayesians allow a
wide range of statements, including statements about the future such
as "when this coin is flipped it will come up heads," statements about
the past such as "it rained here yesterday," and timeless statements
such as "the value of Newton's constant is between $6.6$ and
$6.7 \times 10^{-11}\,\mathrm{m^3/kg\,s^2}$." Some level of
precision is typically insisted on, so that, for example,
"red is good" might be rejected as too vague. 

A major thesis of this paper is that the class of allowed statements
should include statements about the probabilities of other statements.
Some Bayesians (for example, de Finetti [?]) reject this concept as
meaningless. However, it has found some acceptance and utility in
decision theory, where it is sometimes called a second order
probability; see, e.g., [?]. In particular, it is an experimental fact
that people's decisions depend not only on the probabilities they
assign to various alternatives, but also on the degree of confidence
that they have in their own probability assignments [?]. This degree of
confidence can be quantified and treated as a probability
of a probability. 

To illustrate how we will use the concept, consider the
following problem. Suppose that we have a situation with
exactly two possible outcomes (for example, a coin flip).
Call the two outcomes $A$ and $B$. In the terminology of
the logical calculus, $A \vee B = \Omega$ and $A \wedge B = \emptyset$,
so that $A$ and $B$ are a complete set. The probability axioms then
require $P(A) + P(B) = 1$, but do not tell us anything
about either $P(A)$ or $P(B)$ alone. 

In the absence of any other information, we invoke
Laplace's principle of insufficient reason (also called the
principle of indifference): when we have no cause to prefer one
statement over another, we assign them equal probabilities. Thus we
are instructed to choose $P(A) = P(B) = \tfrac12$. While this
assignment is logically sound, we clearly cannot have a great deal of
confidence in it; typically, we are prepared to abandon it as soon as
we get some more information. 

Another (and, we argue, better) strategy is to retreat
from the responsibility of assigning a particular value to
$P(A)$, and instead assign a probability $P(H)$ to the statement
$H$ = "the value of $P(A)$ is between $h$ and $h + dh$."
Here $dh$ is infinitesimal, and $0 \le h \le 1$. Then $P(H)$
takes the form $p(h)\,dh$, where $p(h)$ is a nonnegative function
that we must choose, normalized by $\int_0^1 p(h)\,dh = 1$.
We might choose $p(h) = 1$, for example. 

Now suppose we get some more information about $A$
and $B$. Suppose that the situation that produces either
$A$ or $B$ as an outcome can be recreated repeatedly (each
repetition will be called a trial), and that the outcomes of
the different trials are (we believe) independent. Suppose
that the result of the first $N$ trials is $N_A$ A's and $N_B$ B's,
in a particular order. What can we say now? 

The formula we need is

Bayes' Theorem. $P(H|D) = P(D|H)\,P(H)/P(D)$.

Bayes' theorem follows immediately from Axiom 4; since
$H \wedge D$ is the same as $D \wedge H$, we have
$P(H|D)P(D) = P(H \wedge D) = P(D|H)P(H)$. While $H$ and $D$ can be any
allowed statements, the letters are intended to denote
"hypothesis" and "data". Bayes' theorem tells us that, if we have
a hypothesis $H$ to which we have somehow assigned
a prior probability $P(H)$ (whether by the principle of
indifference, or by any other means), and we know (or can
compute) the likelihood $P(D|H)$ of getting a particular
set of data $D$ given that the hypothesis $H$ is true, then
we can compute the posterior probability $P(H|D)$ that
the hypothesis $H$ is true, given the data $D$ that we have
obtained. Furthermore, if we have a complete set of hypotheses $H_i$,
then we can express $P(D)$ in terms of the associated likelihoods and
prior probabilities: starting with
$D = D \wedge \Omega = D \wedge (H_1 \vee H_2 \vee \ldots) = (D \wedge H_1) \vee (D \wedge H_2) \vee \ldots$,
and noting that $D \wedge H_i$ and $D \wedge H_j$ are mutually exclusive
when $i \neq j$, we have

$$P(D) = \sum_i P(D \wedge H_i) = \sum_i P(D|H_i)\,P(H_i), \qquad (1)$$

where the first equality follows from Axiom 3, and the
second from Axiom 4. 

To apply these results to the case at hand, recall that
our hypothesis is $H$ = "$P(A)$ is between $h$ and $h + dh$".
We have assigned this hypothesis a prior probability
$P(H) = p(h)\,dh$. The data $D$ is a string of $N_A$ A's and
$N_B$ B's, in a particular order; each of the $N = N_A + N_B$
outcomes is assumed to be independent of all the others. Using the
definition of independence, we see that the likelihood is

$$P(D|H) = P(A)^{N_A} P(B)^{N_B} = h^{N_A}(1-h)^{N_B}. \qquad (2)$$

Applying Bayes' Theorem, we get the posterior probability

$$P(H|D) = P(D)^{-1}\, h^{N_A}(1-h)^{N_B}\, p(h)\,dh, \qquad (3)$$

where

$$P(D) = \int_0^1 h^{N_A}(1-h)^{N_B}\, p(h)\,dh. \qquad (4)$$

If the number of trials $N$ is large, and if the prior probability
$p(h)$ has been chosen to be a slowly varying function, then the
posterior probability $P(H|D)$ has a sharp peak at
$h = h_{\rm exp} = N_A/N$, the fraction of trials that
resulted in outcome $A$. The width of this peak is proportional to
$N^{-1/2}$ if both $N_A$ and $N_B$ are large, and to $N^{-1}$
if either $N_A$ or $N_B$ is small (or zero). Thus, after a large
number of trials, we can be confident that the probability $P(A)$ that
the next outcome will be $A$ is close to the
fraction of trials that have already resulted in $A$. The
only people who will not be convinced of this are those
whose choice of prior probability $p(h)$ is strongly biased
against the value $h = h_{\rm exp}$. Thus, the value $h_{\rm exp}$ for the
probability $h$ is becoming objective, in the sense that almost all
observers agree on it. Furthermore, those who
do not agree can be identified a priori by noting that
their prior probabilities are strong functions of $h$. 
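
The narrowing of the posterior can be checked numerically. The following is a minimal sketch, assuming the uniform prior $p(h) = 1$; in that case eq. (3) is a Beta density with parameters $N_A + 1$ and $N_B + 1$ (a standard result, not derived in the text), whose mean and standard deviation are known in closed form. The function name `posterior_summary` is illustrative only.

```python
from math import sqrt

def posterior_summary(n_a, n_b):
    """Mean and standard deviation of the posterior for h = P(A).

    Assumes the uniform prior p(h) = 1, so that the posterior of eq. (3)
    is a Beta(n_a + 1, n_b + 1) density (an assumption of this sketch).
    """
    a, b = n_a + 1, n_b + 1
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, sqrt(var)

# Example: 30% of trials give A, for increasing numbers of trials.
for n in (10, 1000, 100000):
    n_a = (3 * n) // 10
    mean, width = posterior_summary(n_a, n - n_a)
    print(n, mean, width)
```

With both counts large, the printed width shrinks roughly as $N^{-1/2}$; calling `posterior_summary(n, 0)` instead shows the faster $N^{-1}$ behavior mentioned above.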

Those who reject the notion of a probability of a probability, but who
accept the practical utility of this analysis
(which was originally carried out by Laplace), have two
options. Option one is to declare that $h$ is not actually a
probability; it is rather a limiting frequency or a propensity or a
chance. Option two is to declare that $p(h)\,dh$ is
not actually a probability; it is a measure or a generating
function. 

Let us explore option two in more detail. Rather than
assigning a second-order probability to $H$ = "$P(A)$ is
between $h$ and $h + dh$", we assign a probability to every finite
sequence of outcomes; that is, we choose values
for $P(A)$, $P(B)$, $P(AB)$, $P(BA)$, $P(AAA)$, $P(AAB)$,
and so on, for strings of arbitrarily many outcomes. We
assume that all possible strings of $N$ outcomes form
a complete set. Our probability assignments must of
course satisfy the probability axioms, so that, for example,
$P(A) + P(B) = 1$. We also insist that the assignments be symmetric;
that is, independent of the ordering
of the outcomes, so that, for example,

$$P(AAB) = P(ABA) = P(BAA). \qquad (5)$$

Furthermore, the assignments for strings of $N$ outcomes
must be consistent with those for $N + 1$ outcomes; this
means that, for any particular string of $N$ outcomes $S$,

$$P(S) = P(SA) + P(SB). \qquad (6)$$

A set of probability assignments that satisfies these requirements is
said to be exchangeable. Then, the de
Finetti representation theorem [4] states that, given an
exchangeable set of probability assignments for all possible strings
of outcomes, the probability of getting a specific string $D$ of $N$
outcomes that includes exactly $N_A$ A's and $N_B$ B's can always be
written in the form

$$P(D) = \int_0^1 h^{N_A}(1-h)^{N_B}\, p(h)\,dh, \qquad (7)$$

where $p(h)$ is a unique nonnegative function that obeys
the normalization condition $\int_0^1 dh\, p(h) = 1$, and is the
same for every string $D$. Note that eq. (7) is exactly the
same as eq. (4). Thus an exchangeable probability assignment to
sequences of outcomes can be characterized
by a function $p(h)$ that can be (as we have seen) consistently
treated as a probability of a probability. But those
who find this notion unpalatable are free to think of $p(h)$
as specifying a measure, or a generating function, or a
similar euphemism. 
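
As a small check of exchangeability, the sketch below evaluates eq. (7) exactly for the uniform choice $p(h) = 1$ (an assumption made here for concreteness), for which the integral equals $N_A!\,N_B!/(N_A + N_B + 1)!$. The helper name `string_probability` is hypothetical.

```python
from fractions import Fraction
from math import factorial

def string_probability(outcomes):
    """P(D) of eq. (7) for a string such as "AAB", assuming p(h) = 1.

    With the uniform prior the integral is the Beta function
    N_A! N_B! / (N_A + N_B + 1)!, evaluated here as an exact fraction.
    """
    n_a = outcomes.count("A")
    n_b = outcomes.count("B")
    return Fraction(factorial(n_a) * factorial(n_b),
                    factorial(n_a + n_b + 1))

# Symmetry, eq. (5): the result depends only on the counts.
print(string_probability("AAB"), string_probability("ABA"))
# Consistency, eq. (6): P(S) = P(SA) + P(SB) for S = "AB".
print(string_probability("AB"),
      string_probability("ABA") + string_probability("ABB"))
```

Because the result depends on the string only through the counts, the symmetry condition holds automatically; the consistency condition follows from $h + (1 - h) = 1$ under the integral.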

To summarize, if we need to assign a prior probability
but have little information, it can be more constructive
to decline to pick a single value, and instead assign a probability to
a range of possible values of the needed prior probability. This
probability of a probability can then be updated with
Bayes' theorem as more information comes in. 

IV. PROBABILITY IN QUANTUM MECHANICS 

Suppose we are given a qubit: a quantum system with
a two-dimensional Hilbert space. (We will use the language
appropriate to a spin-one-half particle to describe it.) We
are asked to make a guess for its quantum state. 

Without further information, the best we can do is
invoke the principle of indifference. In the case of a finite
set of possible outcomes, this principle is based on the
permutation symmetry of the outcomes; we choose the
unique probability assignment that is invariant under this
symmetry. The quantum analog of the permutation of
outcomes is the unitary symmetry of rotations in Hilbert
space. The only quantum state that is invariant under
this symmetry is the fully mixed density matrix

$$\rho = \tfrac12 I. \qquad (8)$$

Thus we are instructed to choose eq. (8) as the quantum
state of the system. While this assignment is logically
sound, we clearly cannot have a great deal of confidence
in it; typically, we are prepared to abandon it as soon as
we get some more information. 

Another (and, we argue, better) strategy is to retreat
from the responsibility of assigning a particular state
(pure or mixed) to the system, and instead assign a probability
$P(H)$ to the statement $H$ = "the quantum state
of the system is a density matrix within a volume $d\rho$ centered on
$\rho$", where $\rho$ is a particular $2 \times 2$ hermitian matrix
with nonnegative eigenvalues that sum to one, and $d\rho$ is
a suitable differential volume element in the space of such
matrices. We can parameterize $\rho$ with three real numbers
$x$, $y$, and $z$ via

$$\rho = \frac{1}{2}\begin{pmatrix} 1+z & x-iy \\ x+iy & 1-z \end{pmatrix}, \qquad (9)$$

where $x^2 + y^2 + z^2 \equiv r^2 \le 1$. We then take
$d\rho = dV$, where $dV = (3/4\pi)\,dx\,dy\,dz$ is the normalized
volume element: $\int dV = 1$. $P(H)$ takes the form
$p(\rho)\,dV$, where $p(\rho)$ is a nonnegative function that we
must choose, normalized by $\int p(\rho)\,dV = 1$. We might
choose $p(\rho) = 1$, for example. 
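
To make the parameterization of eq. (9) concrete, here is a minimal sketch that builds the matrix from $(x, y, z)$ and computes its eigenvalues from the trace and determinant. The function names and the nested-tuple representation are illustrative choices, not part of the original analysis.

```python
from math import sqrt

def density_matrix(x, y, z):
    """2x2 density matrix of eq. (9) as nested tuples of numbers."""
    if x * x + y * y + z * z > 1.0:
        raise ValueError("(x, y, z) must lie in the unit ball")
    return ((0.5 * (1 + z), 0.5 * complex(x, -y)),
            (0.5 * complex(x, y), 0.5 * (1 - z)))

def eigenvalues(rho):
    """Eigenvalues of a 2x2 hermitian matrix from trace and determinant."""
    tr = (rho[0][0] + rho[1][1]).real
    det = (rho[0][0] * rho[1][1] - rho[0][1] * rho[1][0]).real
    disc = sqrt(max(tr * tr - 4 * det, 0.0))
    return 0.5 * (tr + disc), 0.5 * (tr - disc)

rho = density_matrix(0.3, 0.4, 0.0)   # a point with r = 0.5
print(eigenvalues(rho))
```

For any point in the unit ball the trace is one and the eigenvalues come out as $\tfrac12(1 \pm r)$, so both are nonnegative, as required of a density matrix.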

Now suppose we get some more information about the
quantum state of the system. Suppose that the procedure that prepares
the quantum state of the particle can
be recreated repeatedly (each repetition of this will be
called a trial), and that the outcomes of measurements
performed on each prepared system are (we believe) independent.
Suppose further that we have access to a Stern-Gerlach apparatus that
allows us to measure whether the
spin is $+$ or $-$ along an axis of our choice. We choose
the $z$ axis. Suppose that the result of the first $N$ trials
is $N_+$ $+$'s and $N_-$ $-$'s. What can we say now? 

Given a density matrix $\rho$, parameterized by eq. (9), the
rules of quantum mechanics tell us that the probability
that a measurement of the spin along the $z$ axis will yield
$+1$ is

$$P(\sigma_z = +1|\rho) = \mathrm{Tr}\,\tfrac12(1+\sigma_z)\rho = \tfrac12(1+z), \qquad (10)$$

where $\sigma_z$ is a Pauli matrix, and the probability that this
measurement will yield $-1$ is

$$P(\sigma_z = -1|\rho) = \mathrm{Tr}\,\tfrac12(1-\sigma_z)\rho = \tfrac12(1-z). \qquad (11)$$

Now we use Bayes' theorem. Our hypothesis is $H$ =
"the quantum state is within a volume $d\rho$ centered on
$\rho$". We have assigned this hypothesis a prior probability
$P(H) = p(\rho)\,d\rho$. The data $D$ is a string of $N_+$ $+$'s and
$N_-$ $-$'s, in a particular order; each of the $N = N_+ + N_-$
outcomes is assumed to be independent of all the others.
Using the definition of independence, we see that the
likelihood is

$$P(D|H) = [P(\sigma_z = +1|\rho)]^{N_+}\,[P(\sigma_z = -1|\rho)]^{N_-} = [\tfrac12(1+z)]^{N_+}\,[\tfrac12(1-z)]^{N_-}. \qquad (12)$$

Applying Bayes' Theorem, we get the posterior probability

$$P(H|D) = P(D)^{-1}\,[\tfrac12(1+z)]^{N_+}\,[\tfrac12(1-z)]^{N_-}\,p(\rho)\,d\rho, \qquad (13)$$

where

$$P(D) = \int [\tfrac12(1+z)]^{N_+}\,[\tfrac12(1-z)]^{N_-}\,p(\rho)\,d\rho. \qquad (14)$$

When the number of trials $N$ is large, and the prior probability
$p(\rho)$ is a slowly varying function, the posterior
probability $P(H|D)$ has a sharp peak at
$z = z_{\rm exp} = (N_+ - N_-)/N$. Thus, after a large number of trials
in which we measure $\sigma_z$, we can be confident of the value of
the parameter $z$ in the density matrix of the system. The
only people who will not be convinced of this are those
whose choice of prior probability $p(\rho)$ is strongly biased
against the value $z = z_{\rm exp}$. Furthermore, those who do
not agree can be identified a priori by noting that their
prior probabilities are strong functions of $\rho$.

We can of course orient our Stern-Gerlach apparatus
along different axes. If we choose the $x$ axis or the $y$ axis,
the relevant predictions of quantum mechanics are

$$P(\sigma_x = +1|\rho) = \mathrm{Tr}\,\tfrac12(1+\sigma_x)\rho = \tfrac12(1+x), \qquad (15)$$

$$P(\sigma_x = -1|\rho) = \mathrm{Tr}\,\tfrac12(1-\sigma_x)\rho = \tfrac12(1-x), \qquad (16)$$

$$P(\sigma_y = +1|\rho) = \mathrm{Tr}\,\tfrac12(1+\sigma_y)\rho = \tfrac12(1+y), \qquad (17)$$

$$P(\sigma_y = -1|\rho) = \mathrm{Tr}\,\tfrac12(1-\sigma_y)\rho = \tfrac12(1-y). \qquad (18)$$
For each trial, we can choose whether to measure $\sigma_x$,
$\sigma_y$, or $\sigma_z$. (We could also choose to measure along any
other axis.) Then, if the outcomes include $N_{+z}$ measurements
of $\sigma_z$ with the result $\sigma_z = +1$, and so on, the posterior
probability becomes

$$P(H|D) = P(D)^{-1}\,[\tfrac12(1+x)]^{N_{+x}}[\tfrac12(1-x)]^{N_{-x}} \times [\tfrac12(1+y)]^{N_{+y}}[\tfrac12(1-y)]^{N_{-y}} \times [\tfrac12(1+z)]^{N_{+z}}[\tfrac12(1-z)]^{N_{-z}}\,p(\rho)\,d\rho, \qquad (19)$$

where $P(D)$ is given by the obvious integral. Clearly
the discussion in the preceding paragraph is simply triplicated, and,
when the number of trials is large, we have
determined the entire density matrix to the satisfaction
of all but strongly biased observers. Our subjective probabilities of
probabilities have led us to an objective conclusion about quantum
probabilities. 
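
The peak structure of the three-axis posterior can be illustrated with a short sketch. It assumes the flat prior $p(\rho) = 1$, so that the posterior is proportional to the likelihood inside the unit ball; the `counts` dictionary layout and both function names are hypothetical.

```python
from math import log, inf

def log_posterior(x, y, z, counts):
    """Unnormalized log of the three-axis posterior with p(rho) = 1.

    `counts` maps an axis name to (n_plus, n_minus); this layout is an
    illustrative choice, not taken from the text.
    """
    if x * x + y * y + z * z > 1.0:
        return -inf                      # outside the allowed ball
    total = 0.0
    for axis, value in (("x", x), ("y", y), ("z", z)):
        n_plus, n_minus = counts[axis]
        for n, p in ((n_plus, 0.5 * (1 + value)),
                     (n_minus, 0.5 * (1 - value))):
            if n > 0:
                if p <= 0.0:
                    return -inf          # outcome impossible for this rho
                total += n * log(p)
    return total

def peak_estimate(counts):
    """Frequencies (N_+ - N_-)/(N_+ + N_-) at which each factor peaks.

    Requires at least one measurement along every axis.
    """
    return tuple((p - m) / (p + m) for p, m in (counts[a] for a in "xyz"))

counts = {"x": (75, 25), "y": (50, 50), "z": (75, 25)}
peak = peak_estimate(counts)
print(peak, log_posterior(*peak, counts))
```

For finite data the three frequencies need not lie inside the unit ball; in that case the maximum of the posterior sits on the boundary $r = 1$ and this simple estimate no longer applies.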

In [1], Caves et al. arrived at an essentially identical
result. The main difference in their analysis is that they
regarded $p(\rho)\,d\rho$ as a measure rather than a probability.
This approach required them to prove, first, a quantum
version of the de Finetti theorem [?] and, second, that
Bayes' theorem can be applied to $p(\rho)\,d\rho$ [?]. Both steps
become unnecessary if we treat $p(\rho)\,d\rho$ as, fundamentally,
a probability. 



V. PROBABILITIES FOR DENSITY MATRICES VS. DENSITY MATRICES 

If we assign an impure density matrix $\rho$ to a quantum
system, does this not already take into account our ignorance about
it? Why is it preferable to assign, instead, a
probability $p(\rho)\,d\rho$ to the set of possible density matrices? 

It depends on the nature of our ignorance. Suppose,
for example, the system is the spin of an electron plucked
from the air. Then we expect that eq. (8) will describe
it, in the sense that if we do repeated trials (plucking
a new electron each time, and measuring its spin along
an axis of our choice), we will find that
$x_{\rm exp} \equiv (N_{+x} - N_{-x})/(N_{+x} + N_{-x})$,
$y_{\rm exp} \equiv (N_{+y} - N_{-y})/(N_{+y} + N_{-y})$,
and $z_{\rm exp} \equiv (N_{+z} - N_{-z})/(N_{+z} + N_{-z})$ all tend to zero. 

Suppose instead that the spin is prepared by a technician who (with
the aid of a Stern-Gerlach device) puts it
in either a pure state with $\sigma_z = +1$, or a pure state with
$\sigma_x = +1$, and each time decides which choice to make
by flipping a coin that we believe is fair. In this case the
appropriate density matrix is

$$\rho = \tfrac12\left[\tfrac12(1+\sigma_z)\right] + \tfrac12\left[\tfrac12(1+\sigma_x)\right] = \frac{1}{4}\begin{pmatrix} 3 & 1 \\ 1 & 1 \end{pmatrix}. \qquad (20)$$

Comparing with eq. (9), we see that we now expect
$x_{\rm exp}$, $y_{\rm exp}$, and $z_{\rm exp}$ to approach
$+\tfrac12$, $0$, and $+\tfrac12$, respectively. 

Now suppose that the spin is prepared by a technician
who puts it in either a pure state with $\sigma_z = +1$, or a pure
state with $\sigma_x = +1$, and makes the same choice every
time. We, however, are not aware of what her choice is.

If forced to assign a particular density matrix, we
would have to choose eq. (20). However, our situation
is clearly different from what it was in the previous example. In the
present case, repeated experiments would
not verify eq. (20), but would instead converge on either
$x_{\rm exp} = 0$ and $z_{\rm exp} = +1$, or $x_{\rm exp} = +1$ and
$z_{\rm exp} = 0$.
Therefore, in this case, it is more appropriate to assign a
prior probability of one-half to $\rho = \tfrac12(1+\sigma_z)$ and a
prior probability of one-half to $\rho = \tfrac12(1+\sigma_x)$. Then,
as data comes in, we can update these probability assignments
with Bayes' theorem, as described in section IV. 
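
The two-hypothesis update can be written out explicitly. In the sketch below, hypothesis "Z" stands for $\rho = \tfrac12(1+\sigma_z)$ and hypothesis "X" for $\rho = \tfrac12(1+\sigma_x)$; the labels, the table `P_PLUS`, and the function names are illustrative. The table entries follow from eqs. (10) and (15) evaluated on the two pure states.

```python
from fractions import Fraction

# Probability of a +1 result for each (hypothesis, measured axis).
P_PLUS = {
    ("Z", "z"): Fraction(1), ("Z", "x"): Fraction(1, 2),
    ("X", "z"): Fraction(1, 2), ("X", "x"): Fraction(1),
}

def update(prior_z, axis, result):
    """Posterior probability of hypothesis Z after one measurement.

    prior_z: current probability that rho = (1 + sigma_z)/2
    axis:    "z" or "x", the measured spin component
    result:  +1 or -1
    """
    def likelihood(hyp):
        p = P_PLUS[(hyp, axis)]
        return p if result == +1 else 1 - p

    num = likelihood("Z") * prior_z
    den = num + likelihood("X") * (1 - prior_z)   # P(D), as in eq. (1)
    return num / den

def run(results, axis="z", prior_z=Fraction(1, 2)):
    """Apply `update` once per recorded result along a single axis."""
    for r in results:
        prior_z = update(prior_z, axis, r)
    return prior_z

print(run([+1, +1, +1]))   # three +1 results along z
print(run([+1, -1]))       # a single -1 along z rules out hypothesis Z
```

Each $+1$ result along $z$ doubles the odds in favor of hypothesis Z, while a single $-1$ result along $z$ drives its probability to zero. The sketch does not guard against data that both hypotheses exclude, in which case the denominator vanishes.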



Thus, it is better to choose $p(\rho)\,d\rho$ when it is possible
that there is something about the preparation procedure
that consistently prefers a particular direction in Hilbert
space, but we do not know what that direction is. Since
this possibility can rarely be ruled out a priori, we are
typically better served by choosing a prior probability
$p(\rho)\,d\rho$, rather than a particular value of $\rho$ itself. 

VI. NONINFORMATIVE PRIORS FOR DENSITY MATRICES 

Suppose we have decided to choose a prior probability
$p(\rho)\,d\rho$ for the density matrix $\rho$ of some quantum system.
How should we choose this probability? 

In the case where we have little or no information about
the quantum system, we would like to formulate the appropriate analog
of the principle of indifference. Consider
a qunit, a quantum system whose Hilbert space has dimension $n$ that
is known to us. (We will not consider the
even more general problem where $n$ is unknown.) We
can always write the density matrix (whatever it is) in
the form

$$\rho = U^{-1}\tilde\rho\,U, \qquad (21)$$

where $U$ is unitary with determinant one, and $\tilde\rho$ is
diagonal with nonnegative entries $p_1, \ldots, p_n$ that sum to
one. There is a natural measure for special unitary matrices, the Haar
measure; it is invariant under $U \to CU$,
where $C$ is a constant special unitary matrix. In the
simplest case of $n = 2$, we can parameterize $U$ as
$U = e^{i\alpha_1\sigma_3} e^{i\alpha_2\sigma_2} e^{i\alpha_3\sigma_3}$,
with $0 \le \alpha_1 \le \pi$, $0 \le \alpha_2 \le \pi/2$,
$0 \le \alpha_3 \le \pi$; then the normalized Haar measure is
$dU = \pi^{-2}\sin(2\alpha_2)\,d\alpha_1\,d\alpha_2\,d\alpha_3$. This
construction is extended to all $n$ in [?]. 

Suppose we know that the state of the quantum system
is pure. Then we can set $\tilde\rho_{ij} = \delta_{i1}\delta_{j1}$,
and parameterize $\rho$ via $U$. Then it is natural to choose
$d\rho = dU$ and $p(\rho) = 1$, because this is the only choice that is
invariant under unitary rotations in Hilbert space. 

Now consider the more general case where we do not
have information about the purity of the system's quantum state.
Following [?], we define the volume element
via

$$d\rho \equiv dU\,dF, \qquad (22)$$

where $dU$ is the normalized Haar measure for $U$, and

$$dF = (n-1)!\;\delta(p_1 + \ldots + p_n - 1)\,dp_1 \ldots dp_n \qquad (23)$$

is a normalized measure for the $p_i$'s that we will call the
Feynman measure (because it appears in the evaluation
of one-loop Feynman diagrams). Eq. (23) assumes that
each $p_i$ runs from zero to one; then eq. (21) is an overcomplete
construction, because $U$ can rearrange the $p_i$'s.
This is easily fixed by imposing $p_1 \ge \ldots \ge p_n$, and
multiplying $dF$ by $n!$. However, eq. (23) as it stands is easier
to write and think about; the overcompleteness of this
construction of $\rho$ causes no harm. 

In the case $n = 2$, we previously chose
$d\rho = dV = (3/4\pi)\,dx\,dy\,dz$ for the parameterization of eq. (9).
In this case, the eigenvalues of $\rho$ are $\tfrac12(1+r)$ and
$\tfrac12(1-r)$, with $0 \le r \le 1$. After integrating over $U$,
$dV \to 3r^2\,dr$; in comparison, $dF = dr$ for this case.

The purity of a density matrix $\rho$ can be parameterized
by $\mathrm{Tr}\,\rho^2$, which for $n = 2$ is $\tfrac12(1+r^2)$. Thus
the volume measure $dV$ is more biased towards pure states than is the
Feynman measure $dF$; we have $dV = 3(2\,\mathrm{Tr}\,\rho^2 - 1)\,dF$. 
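
The difference in bias between the two measures can be quantified by the average purity each one assigns. The sketch below is an illustration for $n = 2$ only: it integrates $\mathrm{Tr}\,\rho^2 = \tfrac12(1+r^2)$ against the radial densities $3r^2$ (for $dV$) and $1$ (for $dF$) with a simple midpoint rule. The function name and the step count are arbitrary choices.

```python
def mean_purity(radial_density, steps=100000):
    """Average of Tr rho^2 = (1 + r^2)/2 over a radial density on [0, 1].

    Uses a midpoint rule; `radial_density` is the measure per unit r after
    integrating over U (3 r^2 for dV, 1 for dF in the n = 2 case).
    """
    dr = 1.0 / steps
    norm = 0.0
    acc = 0.0
    for k in range(steps):
        r = (k + 0.5) * dr
        w = radial_density(r) * dr
        norm += w
        acc += w * 0.5 * (1.0 + r * r)
    return acc / norm

purity_dV = mean_purity(lambda r: 3.0 * r * r)   # volume measure
purity_dF = mean_purity(lambda r: 1.0)           # Feynman measure
print(purity_dV, purity_dF)
```

The exact averages are $4/5$ for $dV$ and $2/3$ for $dF$, which the numerical values reproduce closely; the larger mean purity under $dV$ reflects its stronger weighting of nearly pure states.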

In general, we can accommodate any such bias by taking
$p(\rho)\,d\rho$ to be of the form

$$p(\rho)\,d\rho = p(\mathrm{Tr}\,\rho^2)\,dU\,dF, \qquad (24)$$

where $p(x)$ is an increasing function if we are biased towards having
a pure state. For $n > 2$, we can take $p$ to
be a function of $\mathrm{Tr}\,\rho^k$ for $2 \le k \le n$. Arguments
in favor of various choices of $p$ have been put forth [?], but no
single choice seems particularly compelling. Of course,
once we have done enough experiments, our original biases become
largely irrelevant, as we saw in section IV. 

VII. CONCLUSIONS 

We have argued that, in a Bayesian framework, the
nature of our ignorance about a quantum system can often be more
faithfully represented by a prior probability
$p(\rho)\,d\rho$ over the range of allowed density matrices, rather
than by a specific choice of density matrix. This method
is particularly appropriate when (1) the preparation procedure may
favor a direction in Hilbert space, but we do
not know what that direction is, and (2) we can recreate the
preparation procedure repeatedly, and perform
measurements of our choice on each prepared system. In
this case, as data comes in, we use Bayes' theorem to update
$p(\rho)\,d\rho$. Eventually, all but strongly biased observers
(who can be identified a priori by an examination of their
choice of prior probability) will be convinced of the values
of the quantum probabilities. In this way, initially subjective
probability assignments become more and more
objective. 

In choosing $p(\rho)\,d\rho$, we can use the principle of
indifference, applied to the unitary symmetry of Hilbert space,
to reduce the problem to one of choosing a probability
distribution for the eigenvalues of $\rho$. There is, however,
no compelling rationale for any particular choice; in particular, we
must decide how biased we are towards pure
states. 